feat(mep): support time bucketing in queries #2937
Conversation
Codecov Report
@@ Coverage Diff @@
## master #2937 +/- ##
==========================================
- Coverage 92.93% 92.88% -0.05%
==========================================
Files 635 635
Lines 29069 29127 +58
==========================================
+ Hits 27016 27056 +40
- Misses 2053 2071 +18
def process_query(self, query: Query, query_settings: QuerySettings) -> None:
    granularity = self.__get_granularity(query)
    query.add_condition_to_ast(
It is just a matter of adding the condition to the query? Is there no existing granularity condition to be removed/replaced?
yes, that's correct
Why is that though? What was the mechanism by which the granularity was being set on the query previously and wouldn't that still be happening?
it wasn't being set previously on generic_metrics unless you added a where
clause with the granularity column
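For context, a minimal sketch of the condition the processor pins onto the query, using Snuba's query DSL helpers (binary_condition, ConditionFunctions, Column, Literal); the class name and the exact body of the call in this PR are assumptions and may differ:

from snuba.query.conditions import ConditionFunctions, binary_condition
from snuba.query.expressions import Column, Literal

class GranularityProcessor:  # name assumed; mirrors the processor in this PR
    def process_query(self, query, query_settings) -> None:
        # Resolve the granularity requested by the query (or fall back to a
        # default), then add an explicit equality condition so the storage
        # reads from the matching rollup.
        granularity = self.__get_granularity(query)
        query.add_condition_to_ast(
            binary_condition(
                ConditionFunctions.EQ,
                Column(None, None, "granularity"),
                Literal(None, granularity),
            )
        )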
@@ -180,7 +180,7 @@ def test_raw_tags(self) -> None:
 AND tags_raw[{tag_key}] = '{value_as_string}'
 AND timestamp >= toDateTime('{self.start_time}')
 AND timestamp < toDateTime('{self.end_time}')
-GRANULARITY 1
+GRANULARITY 60
I don't think I follow why these queries changed? Are you saying they were wrong before?
I'm going to remove that second condition so now they will be wrong, but I just wanted to verify that we're handling the "raw" granularities correctly
Which one are you saying is the correct value and which one is the wrong one? And if you're saying it will be wrong after this PR can you give some info about why and how will it be addressed in future?
When I originally put up the PR, I thought accepting both the raw and the enum values was correct. I've reconsidered, and I now think we should accept only the "raw" (in seconds) granularity value to reduce confusion.
Our ideal is to take this non-infrastructure concern and move it to the product side (e.g. Sentry should maintain the list of enum values for the granularities it cares about and use them directly in queries). At the moment, though, their query layer doesn't have the flexibility to support two different approaches for building requests.
How it will be addressed in the future is undetermined, but they're aware that we don't like having this logic on our side.
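As a purely illustrative example of the mapping being discussed: the processor takes the raw granularity in seconds from the request and translates it to the enum value stored in the table. The pairs below are assumptions for the sketch, not the actual schema; the real values live in PERFORMANCE_GRANULARITIES / DEFAULT_MAPPED_GRANULARITY_ENUM in this PR.

EXAMPLE_GRANULARITIES = [
    (60, 1),      # one minute
    (3600, 2),    # one hour
    (86400, 3),   # one day
]

def map_raw_granularity(raw_seconds: int) -> int:
    # Only the raw (seconds) value is accepted from the caller; it is mapped
    # to the enum value ClickHouse actually stores, or rejected.
    for raw, enum_value in EXAMPLE_GRANULARITIES:
        if raw == raw_seconds:
            return enum_value
    raise ValueError(f"unsupported granularity: {raw_seconds}s")

assert map_raw_granularity(60) == 1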
def __init__(
    self,
    accepted_granularities: Sequence[Tuple[int, int]],
Why do you pass these in? Everywhere you are using this class it is using the constants above.
I actually really like that this processor is not tied to the specific granularity values we happen to be storing in this table for metrics. Even though it's only used in one place today I can imagine us experimenting with different granularities in different tables in the future.
because "generic" metrics supports arbitrary granularities depending on the need of the customer, I wanted to make it flexible (e.g. we can support 10s, but performance doesn't use it, so we're not mapping it here)
        accepted_granularities=PERFORMANCE_GRANULARITIES,
        default_granularity_enum=DEFAULT_MAPPED_GRANULARITY_ENUM,
    ),
    TimeSeriesProcessor({"bucketed_time": "timestamp"}, ("timestamp",)),
Do you need this processor if you have the granularity processor above?
yes, because the product wants to round timestamps to odd intervals like 15m or 3h, which isn't handled by the granularity processor alone
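Roughly speaking, the bucketed_time mapping floors each timestamp to the start of the requested interval. A small self-contained sketch of that rounding follows; the actual ClickHouse expression Snuba generates may differ.

from datetime import datetime, timezone

def bucket_timestamp(ts: datetime, interval_seconds: int) -> datetime:
    # Floor the timestamp to the start of its interval, e.g. 900s for 15m
    # buckets, which is not one of the stored granularities and therefore
    # can't come from the granularity processor alone.
    epoch = int(ts.timestamp())
    return datetime.fromtimestamp(epoch - epoch % interval_seconds, tz=timezone.utc)

# A 15-minute bucket can still be served from the 60s rollup:
print(bucket_timestamp(datetime(2022, 8, 1, 10, 7, 42, tzinfo=timezone.utc), 900))
# -> 2022-08-01 10:00:00+00:00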
Add support for bucketed time and for handling the granularity → enum value mapping in query processing.
@wmak @ahmedetefy for visibility